<!DOCTYPE html>
<html>
<!-- Created by GNU Texinfo 7.1.1, https://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<!-- Copyright © 1988-2023 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Funding Free Software", the Front-Cover
Texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below).  A copy of the license is included in the section entitled
"GNU Free Documentation License".

(a) The FSF's Front-Cover Text is:

A GNU Manual

(b) The FSF's Back-Cover Text is:

You have freedom to copy and modify this GNU Manual, like GNU
     software.  Copies published by the Free Software Foundation raise
     funds for GNU development. -->
<title>RTL passes (GNU Compiler Collection (GCC) Internals)</title>

<meta name="description" content="RTL passes (GNU Compiler Collection (GCC) Internals)">
<meta name="keywords" content="RTL passes (GNU Compiler Collection (GCC) Internals)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta name="viewport" content="width=device-width,initial-scale=1">

<link href="index.html" rel="start" title="Top">
<link href="Option-Index.html" rel="index" title="Option Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Passes.html" rel="up" title="Passes">
<link href="Optimization-info.html" rel="next" title="Optimization info">
<link href="Tree-SSA-passes.html" rel="prev" title="Tree SSA passes">
<style type="text/css">
<!--
a.copiable-link {visibility: hidden; text-decoration: none; line-height: 0em}
span:hover a.copiable-link {visibility: visible}
ul.mark-bullet {list-style-type: disc}
-->
</style>


</head>

<body lang="en">
<div class="section-level-extent" id="RTL-passes">
<div class="nav-panel">
<p>
Next: <a href="Optimization-info.html" accesskey="n" rel="next">Optimization info</a>, Previous: <a href="Tree-SSA-passes.html" accesskey="p" rel="prev">Tree SSA passes</a>, Up: <a href="Passes.html" accesskey="u" rel="up">Passes and Files of the Compiler</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<h3 class="section" id="RTL-passes-1"><span>9.6 RTL passes<a class="copiable-link" href="#RTL-passes-1"> &para;</a></span></h3>

<p>The following briefly describes the RTL generation and optimization
passes that are run after the Tree optimization passes.
</p>
<ul class="itemize mark-bullet">
<li>RTL generation

<p>The source files for RTL generation include
<samp class="file">stmt.cc</samp>,
<samp class="file">calls.cc</samp>,
<samp class="file">expr.cc</samp>,
<samp class="file">explow.cc</samp>,
<samp class="file">expmed.cc</samp>,
<samp class="file">function.cc</samp>,
<samp class="file">optabs.cc</samp>
and <samp class="file">emit-rtl.cc</samp>.
Also, the file
<samp class="file">insn-emit.cc</samp>, generated from the machine description by the
program <code class="code">genemit</code>, is used in this pass.  The header file
<samp class="file">expr.h</samp> is used for communication within this pass.
</p>
<a class="index-entry-id" id="index-genflags"></a>
<a class="index-entry-id" id="index-gencodes"></a>
<p>The header files <samp class="file">insn-flags.h</samp> and <samp class="file">insn-codes.h</samp>,
generated from the machine description by the programs <code class="code">genflags</code>
and <code class="code">gencodes</code>, tell this pass which standard names are available
for use and which patterns correspond to them.
</p>
</li><li>Generation of exception landing pads

<p>This pass generates the glue that handles communication between the
exception handling library routines and the exception handlers within
the function.  Entry points in the function that are invoked by the
exception handling library are called <em class="dfn">landing pads</em>.  The code
for this pass is located in <samp class="file">except.cc</samp>.
</p>
</li><li>Control flow graph cleanup

<p>This pass removes unreachable code, simplifies jumps to next, jumps to
jump, jumps across jumps, etc.  The pass is run multiple times.
For historical reasons, it is occasionally referred to as the &ldquo;jump
optimization pass&rdquo;.  The bulk of the code for this pass is in
<samp class="file">cfgcleanup.cc</samp>, and there are support routines in <samp class="file">cfgrtl.cc</samp>
and <samp class="file">jump.cc</samp>.
</p>
</li><li>Forward propagation of single-def values

<p>This pass attempts to remove redundant computation by substituting
variables that come from a single definition, and
seeing if the result can be simplified.  It performs copy propagation
and addressing mode selection.  The pass is run twice, with values
being propagated into loops only on the second run.  The code is
located in <samp class="file">fwprop.cc</samp>.
</p>
</li><li>Common subexpression elimination

<p>This pass removes redundant computation within basic blocks, and
optimizes addressing modes based on cost.  The pass is run twice.
The code for this pass is located in <samp class="file">cse.cc</samp>.
</p>
</li><li>Global common subexpression elimination

<p>This pass performs two
different types of GCSE  depending on whether you are optimizing for
size or not (LCM based GCSE tends to increase code size for a gain in
speed, while Morel-Renvoise based GCSE does not).
When optimizing for size, GCSE is done using Morel-Renvoise Partial
Redundancy Elimination, with the exception that it does not try to move
invariants out of loops&mdash;that is left to  the loop optimization pass.
If MR PRE GCSE is done, code hoisting (aka unification) is also done, as
well as load motion.
If you are optimizing for speed, LCM (lazy code motion) based GCSE is
done.  LCM is based on the work of Knoop, Ruthing, and Steffen.  LCM
based GCSE also does loop invariant code motion.  We also perform load
and store motion when optimizing for speed.
Regardless of which type of GCSE is used, the GCSE pass also performs
global constant and  copy propagation.
The source file for this pass is <samp class="file">gcse.cc</samp>, and the LCM routines
are in <samp class="file">lcm.cc</samp>.
</p>
</li><li>Loop optimization

<p>This pass performs several loop related optimizations.
The source files <samp class="file">cfgloopanal.cc</samp> and <samp class="file">cfgloopmanip.cc</samp> contain
generic loop analysis and manipulation code.  Initialization and finalization
of loop structures is handled by <samp class="file">loop-init.cc</samp>.
A loop invariant motion pass is implemented in <samp class="file">loop-invariant.cc</samp>.
Basic block level optimizations&mdash;unrolling, and peeling loops&mdash;
are implemented in <samp class="file">loop-unroll.cc</samp>.
Replacing of the exit condition of loops by special machine-dependent
instructions is handled by <samp class="file">loop-doloop.cc</samp>.
</p>
</li><li>Jump bypassing

<p>This pass is an aggressive form of GCSE that transforms the control
flow graph of a function by propagating constants into conditional
branch instructions.  The source file for this pass is <samp class="file">gcse.cc</samp>.
</p>
</li><li>If conversion

<p>This pass attempts to replace conditional branches and surrounding
assignments with arithmetic, boolean value producing comparison
instructions, and conditional move instructions.  In the very last
invocation after reload/LRA, it will generate predicated instructions
when supported by the target.  The code is located in <samp class="file">ifcvt.cc</samp>.
</p>
</li><li>Web construction

<p>This pass splits independent uses of each pseudo-register.  This can
improve effect of the other transformation, such as CSE or register
allocation.  The code for this pass is located in <samp class="file">web.cc</samp>.
</p>
</li><li>Instruction combination

<p>This pass attempts to combine groups of two or three instructions that
are related by data flow into single instructions.  It combines the
RTL expressions for the instructions by substitution, simplifies the
result using algebra, and then attempts to match the result against
the machine description.  The code is located in <samp class="file">combine.cc</samp>.
</p>
</li><li>Mode switching optimization

<p>This pass looks for instructions that require the processor to be in a
specific &ldquo;mode&rdquo; and minimizes the number of mode changes required to
satisfy all users.  What these modes are, and what they apply to are
completely target-specific.  The code for this pass is located in
<samp class="file">mode-switching.cc</samp>.
</p>
</li><li><a class="index-entry-id" id="index-modulo-scheduling"></a>
<a class="index-entry-id" id="index-sms_002c-swing_002c-software-pipelining"></a>
Modulo scheduling

<p>This pass looks at innermost loops and reorders their instructions
by overlapping different iterations.  Modulo scheduling is performed
immediately before instruction scheduling.  The code for this pass is
located in <samp class="file">modulo-sched.cc</samp>.
</p>
</li><li>Instruction scheduling

<p>This pass looks for instructions whose output will not be available by
the time that it is used in subsequent instructions.  Memory loads and
floating point instructions often have this behavior on RISC machines.
It re-orders instructions within a basic block to try to separate the
definition and use of items that otherwise would cause pipeline
stalls.  This pass is performed twice, before and after register
allocation.  The code for this pass is located in <samp class="file">haifa-sched.cc</samp>,
<samp class="file">sched-deps.cc</samp>, <samp class="file">sched-ebb.cc</samp>, <samp class="file">sched-rgn.cc</samp> and
<samp class="file">sched-vis.c</samp>.
</p>
</li><li>Register allocation

<p>These passes make sure that all occurrences of pseudo registers are
eliminated, either by allocating them to a hard register, replacing
them by an equivalent expression (e.g. a constant) or by placing
them on the stack.  This is done in several subpasses:
</p>
<ul class="itemize mark-bullet">
<li>The integrated register allocator (<abbr class="acronym">IRA</abbr>).  It is called
integrated because coalescing, register live range splitting, and hard
register preferencing are done on-the-fly during coloring.  It also
has better integration with the reload/LRA pass.  Pseudo-registers spilled
by the allocator or the reload/LRA have still a chance to get
hard-registers if the reload/LRA evicts some pseudo-registers from
hard-registers.  The allocator helps to choose better pseudos for
spilling based on their live ranges and to coalesce stack slots
allocated for the spilled pseudo-registers.  IRA is a regional
register allocator which is transformed into Chaitin-Briggs allocator
if there is one region.  By default, IRA chooses regions using
register pressure but the user can force it to use one region or
regions corresponding to all loops.

<p>Source files of the allocator are <samp class="file">ira.cc</samp>, <samp class="file">ira-build.cc</samp>,
<samp class="file">ira-costs.cc</samp>, <samp class="file">ira-conflicts.cc</samp>, <samp class="file">ira-color.cc</samp>,
<samp class="file">ira-emit.cc</samp>, <samp class="file">ira-lives</samp>, plus header files <samp class="file">ira.h</samp>
and <samp class="file">ira-int.h</samp> used for the communication between the allocator
and the rest of the compiler and between the IRA files.
</p>
</li><li><a class="index-entry-id" id="index-reloading"></a>
Reloading.  This pass renumbers pseudo registers with the hardware
registers numbers they were allocated.  Pseudo registers that did not
get hard registers are replaced with stack slots.  Then it finds
instructions that are invalid because a value has failed to end up in
a register, or has ended up in a register of the wrong kind.  It fixes
up these instructions by reloading the problematical values
temporarily into registers.  Additional instructions are generated to
do the copying.

<p>The reload pass also optionally eliminates the frame pointer and inserts
instructions to save and restore call-clobbered registers around calls.
</p>
<p>Source files are <samp class="file">reload.cc</samp> and <samp class="file">reload1.cc</samp>, plus the header
<samp class="file">reload.h</samp> used for communication between them.
</p>
</li><li><a class="index-entry-id" id="index-Local-Register-Allocator-_0028LRA_0029"></a>
This pass is a modern replacement of the reload pass.  Source files
are <samp class="file">lra.cc</samp>, <samp class="file">lra-assign.c</samp>, <samp class="file">lra-coalesce.cc</samp>,
<samp class="file">lra-constraints.cc</samp>, <samp class="file">lra-eliminations.cc</samp>,
<samp class="file">lra-lives.cc</samp>, <samp class="file">lra-remat.cc</samp>, <samp class="file">lra-spills.cc</samp>, the
header <samp class="file">lra-int.h</samp> used for communication between them, and the
header <samp class="file">lra.h</samp> used for communication between LRA and the rest of
compiler.

<p>Unlike the reload pass, intermediate LRA decisions are reflected in
RTL as much as possible.  This reduces the number of target-dependent
macros and hooks, leaving instruction constraints as the primary
source of control.
</p>
<p>LRA is run on targets for which TARGET_LRA_P returns true.
</p></li></ul>

</li><li>Basic block reordering

<p>This pass implements profile guided code positioning.  If profile
information is not available, various types of static analysis are
performed to make the predictions normally coming from the profile
feedback (IE execution frequency, branch probability, etc).  It is
implemented in the file <samp class="file">bb-reorder.cc</samp>, and the various
prediction routines are in <samp class="file">predict.cc</samp>.
</p>
</li><li>Variable tracking

<p>This pass computes where the variables are stored at each
position in code and generates notes describing the variable locations
to RTL code.  The location lists are then generated according to these
notes to debug information if the debugging information format supports
location lists.  The code is located in <samp class="file">var-tracking.cc</samp>.
</p>
</li><li>Delayed branch scheduling

<p>This optional pass attempts to find instructions that can go into the
delay slots of other instructions, usually jumps and calls.  The code
for this pass is located in <samp class="file">reorg.cc</samp>.
</p>
</li><li>Branch shortening

<p>On many RISC machines, branch instructions have a limited range.
Thus, longer sequences of instructions must be used for long branches.
In this pass, the compiler figures out what how far each instruction
will be from each other instruction, and therefore whether the usual
instructions, or the longer sequences, must be used for each branch.
The code for this pass is located in <samp class="file">final.cc</samp>.
</p>
</li><li>Register-to-stack conversion

<p>Conversion from usage of some hard registers to usage of a register
stack may be done at this point.  Currently, this is supported only
for the floating-point registers of the Intel 80387 coprocessor.  The
code for this pass is located in <samp class="file">reg-stack.cc</samp>.
</p>
</li><li>Final

<p>This pass outputs the assembler code for the function.  The source files
are <samp class="file">final.cc</samp> plus <samp class="file">insn-output.cc</samp>; the latter is generated
automatically from the machine description by the tool <samp class="file">genoutput</samp>.
The header file <samp class="file">conditions.h</samp> is used for communication between
these files.
</p>
</li><li>Debugging information output

<p>This is run after final because it must output the stack slot offsets
for pseudo registers that did not get hard registers.  Source files
are <samp class="file">dwarfout.c</samp> for
DWARF symbol table format, files <samp class="file">dwarf2out.cc</samp> and <samp class="file">dwarf2asm.cc</samp>
for DWARF2 symbol table format, and <samp class="file">vmsdbgout.cc</samp> for VMS debug
symbol table format.
</p>
</li></ul>

</div>
<hr>
<div class="nav-panel">
<p>
Next: <a href="Optimization-info.html">Optimization info</a>, Previous: <a href="Tree-SSA-passes.html">Tree SSA passes</a>, Up: <a href="Passes.html">Passes and Files of the Compiler</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html" title="Index" rel="index">Index</a>]</p>
</div>



</body>
</html>
